Abstract: This paper describes a low-complexity approach for
reconstructing average packet arrival rates and instantaneous 
packet counts at a router in a communication network, where 
the arrivals of packets in each flow follow a Poisson process. 
Assuming that the rate vector of this Poisson process is sparse 
or approximately sparse, the goal is to maintain a compressed 
summary of the process sample paths using a small number of 
counters, such that at any time it is possible to reconstruct both 
the total number of packets in each flow and the underlying rate 
vector. We show that these tasks can be accomplished efficiently 
and accurately using compressed sensing with expander graphs. 
In particular, the compressive counts are a linear transformation
of the underlying counting process by the adjacency matrix of an 
unbalanced expander. Such a matrix is binary and sparse, which 
allows for efficient incrementing when new packets arrive. We 
describe, analyze, and compare two methods that can be used to 
estimate both the current vector of total packet counts and the 
underlying vector of arrival rates. 

I. Introduction 

Successful management of large-scale communication net- 
works rests crucially on the availability of accurate traf- 
fic measurements. From the viewpoint of such tasks as 
billing/accounting or intrusion detection, network traffic is 
composed of packet flows (or streams) arriving at or departing 
from routers in the network. As both the number of users and 
the data rates continue growing, there is increasing emphasis 
on traffic measurement architectures that are accurate, fast, 
and cheap. Naturally, some trade-offs between these three 
desiderata are inevitable. For instance, one could keep a 
dedicated counter for each flow, but, depending on the type of 
memory used, one could end up with an implementation that 
is either fast but expensive and unable to keep track of a large 
number of flows (e.g., using SRAMs, which have low access 
times, but are expensive and physically large) or cheap and 
high-density but slow (e.g., using DRAMs, which are cheap 
and small, but have longer access times). 

Recent work has shown that a reasonable compromise 
between accuracy, speed and cost can be found if one takes 
into account certain prior knowledge about the relative flow 
sizes in a typical network. In particular, there is empirical 
evidence [1], [2] that flow sizes in IP networks follow a 
heavy-tail pattern: just a few flows (say, 10%) carry most of 
the traffic (say, 90%). Based on this observation, Estan and 
Varghese [3] proposed two methodologies ("sample-and-hold" 
and "multistage filters") that use a small number of counters 
to keep track only of the flows whose sizes exceed a given
fraction of the total bandwidth. More recently, Lu et al. [4]
developed a new technique, termed "Counter Braids," which 
uses sparse random graphs to aggregate (or "braid") the raw 
packet counts into a small number of counters. The total size of 
each flow can then be recovered at the end of a measurement 
epoch using a message passing decoder. 

The approach of Estan and Varghese [3] allows one to keep 
track only of the few heavy flows while ignoring the rest 
("focusing on the elephants, ignoring the mice," as they put it), 
while Lu et al. [4] can recover the entire vector of flow sizes. 
Moreover, these two approaches rely on different modeling
assumptions. Specifically, in [3] the flow sizes are assumed to 
be deterministic and subject to the heavy-tail behavior, while 
in [4] the flow sizes are i.i.d. realizations of a random variable 
with a heavy-tail distribution. 

A. Our contribution 

The present paper considers a more realistic setting where
each flow (or stream) is modeled as a Poisson process with 
an unknown rate (measured in packets per unit time), and 
it is the rates corresponding to the streams at a given router 
that possess the heavy-tail property. This modeling assumption 
combines certain aspects of [3] and [4]: the heavy-tail property 
is present both on the level of coarse-grained, time-averaged 
behavior of the flows and on the level of actual traffic patterns, 
which are stochastic. Moreover, our model goes beyond the 
i.i.d. assumption of [4] and can account for the heterogeneous 
nature of the different flows entering a particular router.

The main goal is to reconstruct the underlying vector of 
rates while maintaining a small number of counters with low 
access times. To accomplish this goal, we exploit our recent 
work [5] on compressed sensing (CS) with Poisson-distributed 
observations. Mathematically, the heavy-tail property can be 
restated as follows: the vector of rates is, to a good approx- 
imation, sparse. This sparsity interpretation strongly suggests 
that CS can be used to accurately recover the underlying 
vector of rates from a small number of judiciously designed 
linear transformations of the observed flows. Building on the 
results from [5], we show that the raw packet counts can be 
mapped into a small number of "compressed" counts using 
the adjacency matrix of a properly constructed unbalanced 
expander. Such an adjacency matrix has binary entries and is 
sparse (i.e., each column has a small constant number of ones), 
which ensures that the counts can be updated using a small
number of operations as new packets arrive. The resulting
architecture can be used to recover the raw packet counts 
as well. Since we are dealing here with Poisson streams, we 
would like to push the metaphor further and say that we are 
"focusing on the whales, ignoring the minnows." 

We analyze the performance of our scheme theoretically, 
describe an efficient implementation, and present preliminary
experimental results. 

B. Notation 

Given a vector $u \in \mathbb{R}^m$ and a set $S \subseteq \{1, \ldots, m\}$, we
will denote by $u^S$ the vector obtained by setting to zero all
coordinates of $u$ that are in $S^c$, the complement of $S$:
$\forall\, 1 \le i \le m$, $u^S_i = u_i 1_{\{i \in S\}}$. Given some
$1 \le k \le m$, let $S$ be the
set of positions of the $k$ largest (in magnitude) coordinates of
$u$. Then $u^{(k)} = u^S$ will denote the best $k$-term approximation
of $u$ (in any norm on $\mathbb{R}^m$), and

$$\sigma_k(u) \triangleq \|u - u^{(k)}\|_1 = \sum_{i \in S^c} |u_i|$$

will denote the resulting $\ell_1$ approximation error. The $\ell_0$
quasinorm measures the number of nonzero coordinates of
$u$: $\|u\|_0 = \sum_{i=1}^m 1_{\{u_i \ne 0\}}$. Given a vector $u$, we will denote
by $u^+$ the vector obtained by setting to zero all negative
components of $u$: for all $1 \le i \le m$, $u^+_i = \max\{0, u_i\}$.
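The notation above can be made concrete with a few lines of Python. This is only an illustrative sketch: the function names (`best_k_term`, `sigma_k`, `l0`, `positive_part`) are hypothetical and do not come from the paper, and ties between equal magnitudes are broken by index, which is an arbitrary choice.

```python
def best_k_term(u, k):
    """Return u^(k): keep the k largest-magnitude coordinates of u, zero the rest."""
    # Sort indices by decreasing magnitude (ties broken by smaller index first).
    order = sorted(range(len(u)), key=lambda i: (-abs(u[i]), i))
    keep = set(order[:k])
    return [u[i] if i in keep else 0 for i in range(len(u))]


def sigma_k(u, k):
    """l1 error of the best k-term approximation, sigma_k(u) = ||u - u^(k)||_1."""
    approx = best_k_term(u, k)
    return sum(abs(a - b) for a, b in zip(u, approx))


def l0(u):
    """l0 quasinorm: number of nonzero coordinates."""
    return sum(1 for x in u if x != 0)


def positive_part(u):
    """u^+: negative coordinates set to zero."""
    return [max(0, x) for x in u]


u = [3, -1, 0, 5, -2]
print(best_k_term(u, 2))   # [3, 0, 0, 5, 0]
print(sigma_k(u, 2))       # 3
print(l0(u))               # 4
print(positive_part(u))    # [3, 0, 0, 5, 0]
```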

II. Problem formulation 

We wish to monitor a large number N of packet flows 
using a much smaller number M of counters. Each flow is 
a homogeneous Poisson process. Specifically, let $\lambda^* \in \mathbb{R}^N_+$
denote the vector of rates, and let $U$ denote the random process
$U = \{U_t\}_{t \in \mathbb{R}_+}$ with sample paths in $\mathbb{Z}^N_+$, where for any
$t \in \mathbb{R}_+$ and any $k \in \mathbb{Z}^N_+$ we have

$$P_{\lambda^*}(U_t = k) = \prod_{i=1}^{N} \frac{(t\lambda^*_i)^{k_i}}{k_i!}\, e^{-t\lambda^*_i}.$$

In other words, for each $i \in \{1, \ldots, N\}$, the $i$th component of
$U$, which we will denote by $U^{(i)}$, is a homogeneous Poisson
process with the rate of $\lambda^*_i$ arrivals per unit time, and all the
$U^{(i)}$'s are mutually conditionally independent given $\lambda^*$.

The counters are updated in discrete time, every $\tau$ time
units. Let $X = \{X_n\}_{n \in \mathbb{Z}_+}$ denote the sampled version of $U$,
where $X_n = U_{n\tau}$. The update takes place as follows. We
have a binary matrix $A \in \{0,1\}^{M \times N}$, and at each time $n$ let
$Y_n = AX_n$. The probabilistic law governing the evolution
of the counter contents is now

$$P_{\lambda^*}(Y_n = r) = \prod_{j=1}^{M} \frac{(n\tau A\lambda^*)_j^{r_j}}{r_j!}\, e^{-(n\tau A\lambda^*)_j},$$

where the case $(A\lambda^*)_j = 0$ for any $1 \le j \le M$ is
handled using the fact that $\frac{\lambda^{r}}{r!} e^{-\lambda} \to 1_{\{r = 0\}}$ as $\lambda \to 0$.
In other words, $Y_n$ is a sampled version of an $M$-dimensional
homogeneous Poisson process with the rate vector $A\lambda^*$.
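The counter update described above can be sketched in a few lines of Python. This is a toy simulation under stated assumptions, not the authors' implementation: the sensing graph is drawn at random with left degree `d` (it is not verified to be an expander), Poisson variates are drawn with Knuth's multiplication method (adequate only for small means), and all function names are hypothetical. Each arrival in flow `i` increments exactly the `d` counters adjacent to `i`, so the counter contents always satisfy $Y_n = AX_n$.

```python
import math
import random


def random_left_regular_graph(M, N, d, rng):
    """neighbors[i] lists the d distinct counters incremented by flow i."""
    return [rng.sample(range(M), d) for _ in range(N)]


def poisson_sample(mean, rng):
    """Knuth's multiplication method; suitable for small means only."""
    threshold = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count


def run_counters(neighbors, rates, tau, n, M, rng):
    """Simulate n update periods of length tau; return (X_n, Y_n)."""
    X = [0] * len(rates)   # raw per-flow packet counts (never stored in practice)
    Y = [0] * M            # compressed counters
    for _ in range(n):
        for i, lam in enumerate(rates):
            arrivals = poisson_sample(lam * tau, rng)
            X[i] += arrivals
            for j in neighbors[i]:   # d increments per flow, independent of N
                Y[j] += arrivals
    return X, Y


rng = random.Random(0)
N, M, d = 20, 8, 3
rates = [5.0, 4.0] + [0.05] * (N - 2)   # two "whales", many "minnows"
neighbors = random_left_regular_graph(M, N, d, rng)
X, Y = run_counters(neighbors, rates, tau=1.0, n=10, M=M, rng=rng)
print(sum(Y) == d * sum(X))   # every packet is counted in exactly d counters
```

Only `Y` would be kept in fast memory; `X` is tracked here solely to check the linear relation.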

The goal is to estimate the unknown rate vector $\lambda^*$ after $n$
time steps given $Y^n = (Y_1, \ldots, Y_n)$ using an estimator $\hat\lambda_n$
based on $Y^n$: $\hat\lambda_n = \hat\lambda_n(Y^n)$. We measure the quality of such
an estimator by the expected $\ell_1$ risk:

$$R(\hat\lambda_n, \lambda^*) \triangleq E\|\hat\lambda_n - \lambda^*\|_1 = E\left[\sum_{i=1}^{N} |\hat\lambda_{n,i} - \lambda^*_i|\right],$$

where the expectation is taken w.r.t. the underlying flow
process $X \sim P_{\lambda^*}$. Assuming that the unknown rate vector
$\lambda^*$ is a member of a given class $\Lambda$, we would like to design
the counter update matrix $A$ and an accompanying sequence
of estimators $\{\hat\lambda_n\}$ to attain low risk $R(\hat\lambda_n, \lambda^*)$ over $\Lambda$.
One particular class of interest, which pertains to the heavy- 
tail behavior of network traffic, is defined by 

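$$\Sigma_{\alpha, L_0} = \{\lambda \in \mathbb{R}^N_+ : \|\lambda\|_1 \le L_0;\ \sigma_k(\lambda) \le L_0 k^{-\alpha},\ \forall k\} \tag{1}$$

for some $L_0 > 0$ and $\alpha > 1$. Here, $\alpha$ is the power-law
exponent that controls the tail behavior; in particular, the
extreme regime $\alpha \to +\infty$ describes the fully sparse setting.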
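As a small illustration of the class in (1), the following sketch checks membership of a given rate vector by computing every tail sum $\sigma_k(\lambda)$. The function name and the floating-point tolerance are assumptions made for this example; $k$ is taken to range over $1, \ldots, N$.

```python
def in_heavy_tail_class(lam, L0, alpha, tol=1e-12):
    """Check ||lam||_1 <= L0 and sigma_k(lam) <= L0 * k**(-alpha) for k = 1..N."""
    if any(x < 0 for x in lam) or sum(lam) > L0:
        return False
    mags = sorted(lam, reverse=True)
    tail = sum(mags)
    for k in range(1, len(mags) + 1):
        tail -= mags[k - 1]            # tail now equals sigma_k(lam)
        if tail > L0 * k ** (-alpha) + tol:
            return False
    return True


# One dominant flow and two small ones: the tails decay fast enough.
print(in_heavy_tail_class([1.0, 0.01, 0.001], L0=2.0, alpha=1.0))   # True
# Four equal flows: sigma_2 = 2 exceeds L0 * 2**(-2) = 1.
print(in_heavy_tail_class([1.0, 1.0, 1.0, 1.0], L0=4.0, alpha=2.0))  # False
```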

III. Preliminaries 

As we show in the sequel, a good choice for the counter up- 
date matrix A is the adjacency matrix of a suitably constructed 
expander. Adjacency matrices of high-quality expanders have 
been proposed as an alternative to dense, random measurement 
matrices for sparse signal recovery [6]-[10]. This section 
summarizes the key results on expanders, as well as the results 
from our earlier work [5] on the use of expanders for sparse
recovery under the Poisson observation model. 

A. Expanders and sparse recovery 

Definition 1. A $(k, \epsilon)$-unbalanced expander, or simply a $(k, \epsilon)$-
expander, is a bipartite simple graph $G = (A, B, E)$ with left
degree $d$, such that for any $S \subseteq A$ with $|S| \le k$, the set of
neighbors $\mathcal{N}(S)$ of $S$ has size $|\mathcal{N}(S)| \ge (1 - \epsilon) d |S|$.

Here, A (resp., B) corresponds to the components of the 
original signal (resp., its compressed representation). Hence, 
for a given $|A|$, a "high-quality" expander should have $|B|$,
$d$, and $\epsilon$ as small as possible, while $k$ should be as close as
possible to $|B|$. The following proposition (cf. [5], [6]) tells
us what we can expect: 

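Proposition 1. For any $1 \le k \le N/2$ and any $\epsilon \in (0, 1)$,
there exists a $(k, \epsilon)$-expander with left set size $N$, left degree
$d = O\left(\frac{\log(N/k)}{\epsilon}\right)$ and right set size $M = O\left(\frac{k \log(N/k)}{\epsilon^2}\right)$.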
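To make Definition 1 concrete, the expansion property can be checked by brute force on very small graphs. The sketch below is purely illustrative: it enumerates all left subsets of size at most $k$, so its cost grows combinatorially and it is not a practical way to certify the expanders of Proposition 1. The function name and the example graph are hypothetical.

```python
from itertools import combinations


def is_expander(neighbors, k, eps):
    """Brute-force test of the (k, eps)-expansion property.

    neighbors[i] is the set of right nodes adjacent to left node i;
    every left node is assumed to have the same degree d.
    """
    d = len(neighbors[0])
    for size in range(1, k + 1):
        for S in combinations(range(len(neighbors)), size):
            nbrs = set().union(*(neighbors[i] for i in S))
            if len(nbrs) < (1 - eps) * d * size:
                return False
    return True


# Four left nodes, six right nodes, left degree d = 2.
graph = [{0, 1}, {2, 3}, {4, 5}, {0, 2}]
print(is_expander(graph, k=2, eps=0.25))  # True: every pair has >= 3 neighbors
print(is_expander(graph, k=2, eps=0.1))   # False: some pairs have only 3 < 3.6
```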

From this point on, given $N$ and $1 \le k \le N/2$, we will denote
by $G_{k,N}$ a fixed expander with $\epsilon = 1/16$ whose existence is
guaranteed by the above proposition (the value of $\epsilon$ is fixed
for convenience). The following proposition is key to the use 
of expanders for sparse recovery: 

Proposition 2. Let $\Phi = A/d$ be the normalized adjacency
matrix of $G_{k,N}$, and let $u$, $v$ be two vectors in $\mathbb{R}^N$, such that
$\|u\|_1 \ge \|v\|_1 - \Delta$ for some $\Delta \ge 0$. Then

$$\|u - v\|_1 \le 4\sigma_k(u) + 4\|\Phi u - \Phi v\|_1 + 2\Delta.$$



For future reference, we note that, since our expander is
regular, there exists a minimal set $\Omega \subset A$ of size $M$, such
that its neighborhood covers all of $B$, i.e., $\mathcal{N}(\Omega) = B$. Let
$I_\Omega \in \mathbb{R}^N$ be the vector with components $I_{\Omega,i} = 1_{\{i \in \Omega\}}$. In
that case, note that $\Phi I_\Omega \succeq 1_{M \times 1}/d$.

B. Expander-based CS under the Poisson model

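In [5], we have considered the following problem: Let $\theta^* \in
\mathbb{R}^N_+$ be an unknown vector of Poisson intensities with known
$\ell_1$ norm $\|\theta^*\|_1 = L$ (in general, $L$ may be a known upper
bound on $\|\theta^*\|_1$). Given a fixed $1 \le k \le N/2$, let $\Phi$ be the
normalized adjacency matrix of $G_{k,N}$. We observe a random
vector $Z \in \mathbb{Z}^M_+$ distributed according to $Z \sim \mathrm{Poisson}(\Phi\theta^*)$.

Let $\Theta \subset \mathbb{R}^N_+$ be a finite or countable set of candidate
estimators of $\theta^*$ such that $\|\theta\|_1 \le L$, $\forall \theta \in \Theta$, and for a
given $c > 0$ define the set

$$\Gamma_c = \{f = \theta + cLI_\Omega : \theta \in \Theta\}.$$

Moreover, let $\mathrm{pen}(\cdot) : \Theta \to \mathbb{R}_+$ be a penalty (or regularization)
functional satisfying the Kraft inequality,

$$\sum_{\theta \in \Theta} e^{-\mathrm{pen}(\theta)} \le 1.$$

Since there is a one-to-one correspondence between $\Theta$ and $\Gamma_c$,
we will overload our notation and let $\mathrm{pen}(f)$ denote $\mathrm{pen}(\theta)$
whenever $f = \theta + cLI_\Omega$. In [5], we have shown the following:

Proposition 3. Consider the penalized maximum likelihood
estimator (pMLE)

$$\hat f \triangleq \arg\min_{f \in \Gamma_c} \left[-\log P_{\Phi f}(Z) + 2\,\mathrm{pen}(f)\right] \tag{2a}$$
$$\hat\theta \triangleq \hat f - cLI_\Omega. \tag{2b}$$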

Then 



E\\e* - d\\i = O ak{0*r + c{MLY{2d + Mc) 



+ 




mm 



161* -6> 



2 ^ Lcpen(0) 



■ (3) 



Prop. 3 effectively states that the squared $\ell_1$ error of $\hat\theta$ scales
with $M$ times the best penalized approximation error plus
the $k$-term approximation error of $\theta^*$. The first term in (3) is
smaller for sparser $\theta^*$, and the second term is smaller when
there is a $\theta \in \Theta$ which is simultaneously a good approximation to
$\theta^*$ and has a low penalty.

IV. Two ESTIMATION STRATEGIES 

We consider two estimation strategies. In both cases, we 
let our measurement matrix A be the adjacency matrix of a 
$G_{k,N}$ for a fixed $k \le N/2$. The first strategy, which we call
the direct method, uses expander-based CS to first recover an
estimate of $X_n$ from $Y_n$, then constructs an estimate of $\lambda^*$.
The second strategy, which we call the penalized MLE strategy
(or pMLE), relies on the Poisson CS machinery presented in
Section III-B and can be used when only the rates are of
interest. One benefit of pMLE compared to the direct method
is its low complexity, which is derived from a preprocessing
step based on the structure of the underlying expander $G_{k,N}$.

A. The direct method 

The first approach is to use expander-based CS to obtain an
estimate $\hat X_n$ of $X_n$ from $Y_n = AX_n$, followed by letting

$$\hat\lambda_n^{\mathrm{dir}} = \frac{\hat X_n^+}{n\tau}. \tag{4}$$

This strategy is based on the observation that $X_n/(n\tau)$ is
the maximum-likelihood estimator of $\lambda^*$, and will serve as a
"baseline" against which the penalized MLE will be compared.
To obtain $\hat X_n$, we need to solve the convex program

minimize $\|u\|_1$ subject to $Au = Y_n$

which can be cast as a linear program [6]. The resulting
solution $\hat X_n$ may have negative coordinates^1, hence the use of
the $(\cdot)^+$ operation in (4). We then have the following result:



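Theorem 4.

$$R(\hat\lambda_n^{\mathrm{dir}}, \lambda^*) \le 4\sigma_k(\lambda^*) + \frac{\|(\lambda^*)^{1/2}\|_1}{\sqrt{n\tau}}, \tag{5}$$

where $(\lambda^*)^{1/2}$ is the vector with components $\sqrt{\lambda^*_i}$, $\forall i$.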



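Proof: We first observe that, by construction, $\hat X_n$ satisfies
the relations $A\hat X_n = AX_n$ and $\|\hat X_n\|_1 \le \|X_n\|_1$. Hence,

$$E\|\hat X_n - n\tau\lambda^*\|_1 \le E\|\hat X_n - X_n\|_1 + E\|X_n - n\tau\lambda^*\|_1 \le 4E\sigma_k(X_n) + E\|X_n - n\tau\lambda^*\|_1 \tag{6}$$

where the first step uses the triangle inequality, while the
second step uses Proposition 2 with $\Delta = 0$. To bound the
first term in (6), let $S \subset \{1, \ldots, N\}$ denote the positions of
the $k$ largest entries of $\lambda^*$. Then, by definition of the best
$k$-term representation,

$$\sigma_k(X_n) \le \|X_n - X_n^S\|_1 = \sum_{i \in S^c} X_{n,i}.$$

Therefore,

$$E\sigma_k(X_n) \le E\left[\sum_{i \in S^c} X_{n,i}\right] = n\tau \sum_{i \in S^c} \lambda^*_i = n\tau\,\sigma_k(\lambda^*).$$

To bound the second term, we can use concavity of the square
root, as well as the fact that each $X_{n,i} \sim \mathrm{Poisson}(n\tau\lambda^*_i)$, to
write

$$E\|X_n - n\tau\lambda^*\|_1 \le \sum_{i=1}^{N} \sqrt{E(X_{n,i} - n\tau\lambda^*_i)^2} = \sum_{i=1}^{N} \sqrt{n\tau\lambda^*_i}.$$

Now, it is not hard to show that $\|\hat X_n^+ - n\tau\lambda^*\|_1 \le \|\hat X_n -
n\tau\lambda^*\|_1$. Therefore,

$$R(\hat\lambda_n^{\mathrm{dir}}, \lambda^*) \le \frac{E\|\hat X_n - n\tau\lambda^*\|_1}{n\tau} \le 4\sigma_k(\lambda^*) + \frac{\|(\lambda^*)^{1/2}\|_1}{\sqrt{n\tau}},$$

which proves the theorem.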
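The two ingredients of the bound (5) can be checked numerically. The sketch below, with hypothetical function names, first verifies the per-coordinate step $E|X - \mu| \le \sqrt{\mu}$ for $X \sim \mathrm{Poisson}(\mu)$ by summing the probability mass function directly (truncated at a fixed number of terms, which is an approximation that is accurate for the moderate means used here), and then evaluates the right-hand side of (5) for a given rate vector.

```python
import math


def poisson_mean_abs_dev(mu, terms=200):
    """E|X - mu| for X ~ Poisson(mu), by direct summation of the pmf."""
    pmf = math.exp(-mu)          # P(X = 0)
    total = 0.0
    for x in range(terms):
        total += abs(x - mu) * pmf
        pmf *= mu / (x + 1)      # P(X = x + 1) from P(X = x)
    return total


def direct_risk_bound(lam, k, n_tau):
    """Right-hand side of (5): 4 * sigma_k(lam) + ||lam^(1/2)||_1 / sqrt(n * tau)."""
    tail = sum(sorted(lam, reverse=True)[k:])          # sigma_k(lam), lam >= 0
    root_norm = sum(math.sqrt(x) for x in lam)
    return 4 * tail + root_norm / math.sqrt(n_tau)


for mu in (0.1, 1.0, 5.0, 20.0):
    print(mu, poisson_mean_abs_dev(mu) <= math.sqrt(mu))

# Two whales, no minnows, n * tau = 100: the bound is (2 + 1) / 10.
print(direct_risk_bound([4.0, 1.0, 0.0], k=2, n_tau=100.0))
```

The second term of the bound decays like $1/\sqrt{n\tau}$, while the first term is a floor set by the minnows and does not improve with longer observation.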



^1 Khajehnejad et al. [9] have recently proposed the use of perturbed
adjacency matrices of expanders to recover nonnegative sparse signals. 



B. The penalized MLE approach 

The second approach is based on the penalized MLE
framework. Assume that we know a good upper bound $L_0$
on the total average arrival rate $\|\lambda^*\|_1$. Let $\Lambda$ be a sufficiently
large finite set of candidate estimators with $\|\lambda\|_1 \le L_0$ for
all $\lambda \in \Lambda$, and let $\mathrm{pen}(\cdot)$ be a penalty functional satisfying
the Kraft inequality over $\Lambda$. Given $n$ and $\tau$, let $\Theta_{n,\tau} = n\tau d\Lambda$
with the same penalty function.

We can now apply the results of Section III-B with $Z = Y_n$
and $\theta^* = n\tau d\lambda^*$. With this notation, define

$$\hat\lambda_n^{\mathrm{pMLE}} \triangleq \frac{\hat\theta}{n\tau d},$$

where $\hat\theta$ is the corresponding pMLE estimator. Then we have
the following risk bound:
the following risk bound: 

Theorem 5. Let $c = \frac{\gamma}{\log(N/k)}$, where $\gamma > 0$ is chosen so
that $c \ll 1$. Then

^pMLE 



R (xl ,X*)=0 (afc(A*) + ^/^log(iV/fc)) 



+ 0(log(7VA)W-min 



||A^-A||? + 



pen(A) 



riT 



(7) 



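We now develop risk bounds under the heavy-tail condition.
To this end, let us suppose that $\lambda^*$ is a member of the heavy-
tail class $\Sigma_{\alpha, L_0}$ defined in (1). Fix a small positive number $\delta$,
such that $L_0/\sqrt{\delta}$ is an integer, and define the set

$$\Lambda \triangleq \left\{\lambda \in \mathbb{R}^N_+ : \|\lambda\|_1 \le L_0;\ \lambda_i \in \{m\sqrt{\delta}\}_{m=0}^{L_0/\sqrt{\delta}},\ \forall i\right\}.$$

These will be our candidate estimators of $\lambda^*$. We can define
the penalty function $\mathrm{pen}(\lambda) \asymp \|\lambda\|_0 \log(\delta^{-1})$ so that it
satisfies Kraft's inequality. Moreover, if $\delta$ is small enough,
for any $\lambda \in \Sigma_{\alpha, L_0}$ and any $1 \le m \le N$ we will be able to
find some $\lambda^{(m)} \in \Lambda$, such that $\|\lambda^{(m)}\|_0 \le m$ and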

IIA- A 



(m)||2 



m 



-2a 



+ mS. 



We will also assume that $\delta$ is sufficiently small, so that the
penalty term $\frac{m \log(\delta^{-1})}{n\tau}$ dominates the quantization error $m\delta$.
Thus, we can bound the minimum over $\lambda \in \Lambda$ in (7) from
above by


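$$\min_{1 \le m \le N} \left[ m^{-2\alpha} + \frac{m \log(\delta^{-1})}{n\tau} \right] = O\left( \left( \frac{\log(\delta^{-1})}{n\tau} \right)^{\frac{2\alpha}{2\alpha+1}} \right).$$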
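The order of this minimum can be recovered by a standard balancing argument. The derivation below is a sketch, not taken from the paper: it treats $m$ as a continuous variable, writes $a$ for $\log(\delta^{-1})/(n\tau)$ (a shorthand introduced here), and assumes the minimizer falls in $[1, N]$, so that rounding to an integer affects only constants.

```latex
\begin{align*}
g(m) &= m^{-2\alpha} + a\,m, \qquad a = \frac{\log(\delta^{-1})}{n\tau}, \\
g'(m) &= -2\alpha\, m^{-2\alpha - 1} + a = 0
  \;\Longrightarrow\; m_\star = \left(\frac{2\alpha}{a}\right)^{\frac{1}{2\alpha+1}}, \\
g(m_\star) &= \left(\frac{2\alpha}{a}\right)^{-\frac{2\alpha}{2\alpha+1}}
  + a \left(\frac{2\alpha}{a}\right)^{\frac{1}{2\alpha+1}}
  = a^{\frac{2\alpha}{2\alpha+1}}
  \left[ (2\alpha)^{-\frac{2\alpha}{2\alpha+1}} + (2\alpha)^{\frac{1}{2\alpha+1}} \right].
\end{align*}
```

Since the bracketed factor depends only on $\alpha$, the minimum is of order $a^{2\alpha/(2\alpha+1)}$, with the approximation and penalty terms contributing equally up to constants at $m_\star$.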



Using $\tilde{O}(\cdot)$ notation to hide factors that are logarithmic in $N$
and k, we can particularize Theorem 5 to the heavy-tail case: 



Theorem 6. 



sup 



/^pMLE A 




0{k^°') + 



Note that the risk bound here is worse than the benchmark
bound of Theorem 4. However, in order to compute the direct
estimator one has to solve a linear program, whereas, as we
show next, the pMLE can be approximated very efficiently
with proper preprocessing of the observed counts $Y_n$ based
on the structure of $G_{k,N}$.

V. Efficient pMLE Approximation 

In this section we present an efficient algorithm for approximating
the pMLE estimate. The algorithm consists of two
phases: (1) first, we preprocess $Y_n$ to isolate a subset $A_1$ of
$A = \{1, \ldots, N\}$ which is sufficiently small and is guaranteed
to contain the locations of the $k$ largest entries of $\lambda^*$ (the
whales); (2) then we construct a set $\Lambda$ of candidate estimators
whose support sets lie in $A_1$, together with an appropriate
penalty, and perform pMLE over this reduced set.

The success of this approach hinges on the assumption that
the magnitude of the smallest whale is much larger than
the total contribution of the minnows. Specifically, we make
the following assumption: Let $S \subset A$ contain the locations of
the $k$ largest coordinates of $\lambda^*$. Then we require that

$$\frac{\min_{i \in S} \lambda^*_i}{d} > \sigma_k(\lambda^*). \tag{8}$$

One way to think about (8) is in terms of a signal-to-noise
ratio, which must be strictly larger than the left degree $d$ of
the underlying expander [recall that $d = O(\log(N/k))$]. We
also perturb our expander a bit as follows: choose an integer
$k' > 0$ so that

$$\frac{15 k' d}{16} \ge kd + 1. \tag{9}$$

Then we replace our original $(k, 1/16)$-expander with left
degree $d$ with a $(k', 1/16)$-expander with the same left degree.
The resulting procedure, displayed below as Algorithm 1, has
the following guarantees:

Algorithm 1 Efficient pMLE approximation algorithm
Input: Measurement vector $Y_n$, and the sensing matrix $A$.
Output: An approximation $\hat\lambda$.
Let $B_1$ consist of the locations of the $kd$ largest elements of $Y_n$ and let $B_2 = B \setminus B_1$.
Let $A_2$ contain the set of all variable nodes that have at least one neighbor in $B_2$ and let $A_1 = A \setminus A_2$.
Construct a candidate set of estimators $\Lambda$ with support in $A_1$ and a penalty $\mathrm{pen}(\cdot)$ over $\Lambda$.
Output $\hat\lambda = \arg\min_{\lambda \in \Lambda} \left[-\log P_{n\tau A\lambda}(Y_n) + 2\,\mathrm{pen}(\lambda)\right]$.
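The preprocessing phase of Algorithm 1 (the first two steps) can be sketched as follows. This is an illustrative reading of the steps above, not the authors' code: the function name, the adjacency-list representation `neighbors[i]` of the columns of $A$, and the tie-breaking rule among equal counter values are all assumptions. The guarantees of the procedure depend on the expander property and on assumption (8), neither of which the code checks.

```python
def whale_candidates(Y, neighbors, k, d):
    """Return A_1: flows with no neighbor outside the kd largest counters.

    Y          counter contents Y_n (length M)
    neighbors  neighbors[i] = counters incremented by flow i (d per flow)
    """
    M = len(Y)
    # B_1: positions of the kd largest counters (ties broken by index).
    order = sorted(range(M), key=lambda j: (-Y[j], j))
    B1 = set(order[:k * d])
    B2 = set(range(M)) - B1
    # A_2: flows touching B_2; A_1 is everything else.
    return [i for i, nb in enumerate(neighbors) if not (set(nb) & B2)]


# Toy instance: 4 flows, 6 counters, left degree 2, one whale (flow 0).
neighbors = [[0, 1], [2, 3], [4, 5], [0, 2]]
X = [50, 0, 1, 0]                       # unobserved raw counts
Y = [0] * 6
for i, nb in enumerate(neighbors):
    for j in nb:
        Y[j] += X[i]
print(Y)                                        # [50, 50, 0, 0, 1, 1]
print(whale_candidates(Y, neighbors, k=1, d=2)) # [0]
```

Sorting costs $O(M \log M)$ and the scan over flows costs $O(Nd)$, so the candidate set is obtained without solving any optimization problem.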



Theorem 7. The set $A_1$ constructed by Algorithm 1 has the
following properties: (1) $S \subseteq A_1$; (2) $|A_1| \le kd$; (3) $A_1$ can
be found in time $O(Nd) = O(N \log(N/k))$.

Proof: (1) If we decompose $X_n$ as $X_n^S + e$, then $Y_n =
AX_n^S + Ae$. Since each column of $A$ is $d$-sparse and $X_n^S$
is $k$-sparse, $AX_n^S$ is $kd$-sparse. On the other hand, $Y_n =
Y_n^{B_1} + Y_n^{B_2}$, where, by construction, $Y_n^{B_1}$ is the best $kd$-term
approximation of $Y_n$. Hence,

$$\|Y_n^{B_2}\|_1 = \|Y_n - Y_n^{B_1}\|_1 \le \|Ae\|_1 \le d\|e\|_1 \tag{10}$$

where the last inequality follows from the properties of $A$.
Now, since only the nodes in $A_2$ have neighbors in $B_2$,

$$\|Y_n^{B_2}\|_1 = \sum_{j \in B_2} \sum_{i \in A_2} A_{ji} X_{n,i} \ge \sum_{i \in A_2} X_{n,i} = \|X_n^{A_2}\|_1. \tag{11}$$

Combining (10) and (11), we get the bound $\|X_n^{A_2}\|_1 \le d\|e\|_1$.
Taking expectation of both sides, we obtain

$$E\|X_n^{A_2}\|_1 \le d \cdot E\|e\|_1 = d \cdot E\sigma_k(X_n) \le d\,n\tau\,\sigma_k(\lambda^*), \tag{12}$$

where the last step follows the same reasoning as in the proof
of Theorem 4. Now suppose that $S \cap A_2 \ne \emptyset$. Then

$$E\|X_n^{A_2}\|_1 \ge E\|X_n^{S \cap A_2}\|_1 \ge n\tau \min_{i \in S} \lambda^*_i > d\,n\tau\,\sigma_k(\lambda^*),$$

where the last step follows from (8). Since (12) must also
hold, we arrive at a contradiction, and therefore $S \subseteq A_1$.

(2) Suppose, to the contrary, that $|A_1| > kd$. Let $A_1' \subseteq A_1$
be any subset of size $kd + 1$. Now, Lemma 3.6 in [9] states
that, provided $\epsilon \le 1 - 1/d$, every $(\ell, \epsilon)$-expander with left
degree $d$ is also an $(\ell(1 - \epsilon)d, 1 - 1/d)$-expander with left degree
$d$. We apply this result to our $(k', 1/16)$-expander, where $k'$
satisfies (9), to see that it is also a $(kd + 1, 1 - 1/d)$-expander.
Therefore, for the set $A_1'$ we must have $|\mathcal{N}(A_1')| \ge |A_1'| =
kd + 1$. On the other hand, $\mathcal{N}(A_1') \subseteq B_1$, so $|\mathcal{N}(A_1')| \le kd$.
This is a contradiction, hence we must have $|A_1| \le kd$.

(3) Finding the sets $B_1$ and $B_2$ can be done in $O(M \log M)$
time by sorting $Y_n$. The set $A_1$ can then be found in
time $O(Nd)$, by sequentially eliminating all nodes connected
to each node in $B_2$. ∎

Having identified the set $A_1$, we can reduce the pMLE
optimization only to those candidates whose support sets lie
in $A_1$. More precisely, if we originally start with a sufficiently
rich class of estimators $\Lambda$, then the new feasible set can be
reduced to

$$\tilde\Lambda \triangleq \{\lambda \in \Lambda : \mathrm{Supp}(\lambda) \subseteq A_1\}.$$

Hence, by extracting the set $A_1$, we can significantly reduce
the complexity of finding the pMLE estimate. If $|\tilde\Lambda|$ is small,
the optimization can be performed by brute-force search in
$O(|\tilde\Lambda|)$ time. Otherwise, since $|A_1| \le kd$, we can use the
quantization technique from the preceding section with quantizer
resolution $\sqrt{\delta}$ to construct a $\tilde\Lambda$ of size at most $(L_0/\sqrt{\delta})^{kd}$.
In this case, we can even assign the uniform penalty

$$\mathrm{pen}(\lambda) = \log|\tilde\Lambda| = O\left(k \log(N/k) \log(\delta^{-1})\right),$$

which amounts to a vanilla MLE over $\tilde\Lambda$.
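The brute-force search over the reduced candidate set can be sketched as follows, assuming a uniform penalty so that the search is a plain maximum-likelihood fit. This is a toy illustration rather than the procedure used in the experiments: the function names are hypothetical, the constraint $\|\lambda\|_1 \le L_0$ is not enforced, and the `offset` argument is a simplification introduced here that adds a small constant intensity to every counter, loosely playing the role of the shift $cLI_\Omega$ in (2a) so that counters touched only by minnows do not force the likelihood to zero.

```python
import itertools
import math


def poisson_neg_log_lik(Y, means):
    """-log P(Y) for independent Poisson counts with the given means."""
    total = 0.0
    for y, mu in zip(Y, means):
        if mu == 0:
            if y != 0:
                return math.inf      # a zero-mean counter cannot be nonzero
            continue
        total += mu - y * math.log(mu) + math.lgamma(y + 1)
    return total


def brute_force_mle(Y, neighbors, A1, N, n_tau, levels, offset=0.0):
    """Search all rate vectors supported on A1 with entries in `levels`."""
    best, best_val = None, math.inf
    for values in itertools.product(levels, repeat=len(A1)):
        lam = [0.0] * N
        for i, v in zip(A1, values):
            lam[i] = v
        means = [offset] * len(Y)    # counter intensities n*tau*(A lam)_j + offset
        for i in A1:
            for j in neighbors[i]:
                means[j] += n_tau * lam[i]
        val = poisson_neg_log_lik(Y, means)
        if val < best_val:
            best, best_val = lam, val
    return best


neighbors = [[0, 1], [2, 3], [4, 5], [0, 2]]
Y = [50, 50, 0, 0, 1, 1]
levels = [0.0, 2.5, 5.0, 7.5, 10.0]
print(brute_force_mle(Y, neighbors, A1=[0], N=4, n_tau=10.0,
                      levels=levels, offset=0.5))   # [5.0, 0.0, 0.0, 0.0]
```

The search visits `len(levels) ** len(A1)` candidates, which is why restricting the support to $A_1$ matters: the exponent is at most $kd$ rather than $N$.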

VI. Experimental Results 

Here we compare penalized MLE with $\ell_1$-magic [11], a
universal $\ell_1$ minimization method, and with SSMP [10], an
alternative method that employs combinatorial optimization.
$\ell_1$-magic and SSMP both compute the "direct" estimator by
solving a convex program. The pMLE estimate is computed
using Algorithm 1 above and the Sparse Poisson Intensity
Reconstruction ALgorithm (SPIRAL) [12] for reconstruction
of sparse signals from indirect Poisson measurements.



Figures 1(a) through 2(c) report the results of numerical
experiments, where the goal is to identify the $k$ largest entries
in the rate vector from the measured data. The set of $k$ largest
entries (the whales) is chosen at random. Since a random graph
is, with overwhelming probability, an expander graph, each
experiment was repeated 30 times^2.

Given a particular relative sizing of whales and minnows,
Figure 1(a) reports values of $k$ where recovery is possible
with generic $\ell_1$ algorithms ($\ell_1$-magic) but not with SSMP or
pMLE. As $k$ increases the first algorithm to fail is SSMP,
and the probability of successful recovery falls more sharply
than for pMLE. We also report the relative $\ell_1$ error
($\|\lambda - \hat\lambda_n\|_1 / \|\lambda - \lambda^{(k)}\|_1$) as a function of $k$ in Figure 1(b). However,
the complexity of $\ell_1$-magic is 2-3 orders of magnitude greater
than that of pMLE, as shown in Figure 1(c).

The effect of increased variability in the size of whales
is to reduce the value of $k$ at which pMLE fails. The size
of minnows in Figure 2 is the same as in Figure 1, but the
variation in the size of whales is determined by an $\mathcal{N}(0, 1)$
Gaussian random variable. Here we see that Algorithm 1
combined with SPIRAL is still two orders of magnitude faster, but
the probability of success drops substantially for $k > 80$.

VII. Conclusions 

The compressed sensing algorithms based on Poisson ob- 
servations and expander-graph sensing matrices provide a 
useful mechanism for accurately and efficiently estimating a 
collection of flow rates with relatively few counters. These 
techniques have the potential to significantly reduce the cost 
of hardware required for flow rate estimation. While previous 
approaches assumed packet counts matched the flow rates 
exactly or that flow rates were i.i.d., the approach in this paper 
accounts for the Poisson nature of packet counts with relatively 
mild assumptions about the underlying flow rates (i.e., that 
only a small fraction of them are large). 

The "direct" estimation method (in which first the vector 
of flow counts is estimated using a linear program, and
then the underlying flow rates are estimated using Poisson 
maximum likelihood) is juxtaposed with an "indirect" method 
(in which the flow rates are estimated in one pass from the 
compressive Poisson measurements using penalized likelihood 
estimation). The direct method can yield smaller error bounds, 
but this comes at a high computational cost relative to the 
efficient algorithms associated with the indirect method. These 
theoretical results are verified in our simulations. 

The methods in this paper, along with related results in 
this area, are designed for settings in which the flow rates are 
sufficiently stationary, so that they can be accurately estimated 
in a fixed time window. Future directions include extending 
these approaches to a more realistic setting in which the flow 
rates evolve over time. In this case, the time window over 
which packets should be counted may be relatively short, but 
this can be mitigated by exploiting estimates of the flow rates 
in earlier time windows. Another direction for future research 

^2 We observed similar results for experiments with a larger number of trials.




(a) Probability of successful support recovery as a function of number of whales $k$. (b) Relative $\ell_1$ error as a function of number of whales $k$. (c) Average recovery time as a function of number of whales $k$.

Fig. 1. Performance-complexity tradeoff for $\ell_1$-magic, SSMP and pMLE. The number of flows $N = 5000$, the number of counters $M = 800$, and
the number of updates $n = 40$. There are $k$ whales (peaks with magnitude 1), and the remaining entries are minnows with magnitudes determined by a
J^(0, 10~^) random variable. 




(a) Probability of successful support recovery as a function of number of whales $k$. (b) Relative $\ell_1$ error as a function of number of whales $k$. (c) Average recovery time as a function of number of whales $k$. Horizontal axis in all panels: number of whales ($k$).

Fig. 2. Performance-complexity tradeoff for $\ell_1$-magic, SSMP and pMLE. The number of flows $N = 5000$, the number of counters $M = 800$, and the
number of updates $n = 40$. There are $k$ whales (peaks with magnitude determined by a $\mathcal{N}(0, 1)$ random variable), and the remaining entries are minnows
with magnitudes determined by a J\f{0, 10~®) random variable. 



will be to tighten the bounds for the indirect method using 
oracle inequalities based on the Kullback-Leibler divergence. 